Trinucleotide repeats in human genome and exome
Abstract
Trinucleotide repeats (TNRs) are of interest in genetics because they are used as markers for tracing genotype-phenotype relations and because they are directly involved in numerous human genetic diseases. In this study, we searched the human genome reference sequence and annotated exons (exome) for the presence of uninterrupted triplet repeat tracts composed of six or more repeated units. A list of 32 448 TNRs and 878 TNR-containing genes was generated and is provided herein. We found that some triplet repeats, specifically CNG, are overrepresented, while CTT, ATC, AAC and AAT are underrepresented in exons. This observation suggests that the occurrence of TNRs in exons is not random, but is subject to positive or negative selective pressure. Additionally, the TNR type strongly determines where a repeat is located within the mRNA (ORF or UTRs). Most genes containing exon-overrepresented TNRs are associated with gene ontology-defined functions. Surprisingly, many groups of genes that contain TNR types coding for different homo-amino acid tracts associate with the same transcription-related GO categories. We propose that TNRs have the potential to be functional genetic elements and that their variation may be involved in the regulation of many common phenotypes; as such, TNR polymorphisms should be considered a priority in association studies.
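The search criterion described in the abstract (uninterrupted tracts of six or more repeated triplet units) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `find_tnr_tracts`, the regular-expression approach, and the decision to skip mononucleotide runs such as AAA are all assumptions made for illustration only.

```python
import re

MIN_UNITS = 6  # threshold stated in the abstract: six or more repeated units


def find_tnr_tracts(seq, min_units=MIN_UNITS):
    """Return (start, end, unit, n_units) for each uninterrupted triplet tract.

    Hypothetical helper, not the method used in the study. Coordinates are
    0-based, with an exclusive end.
    """
    seq = seq.upper()
    # One triplet followed by at least (min_units - 1) exact copies of itself.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_units - 1))
    tracts = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        # Assumption: a run of a single base (e.g. AAA) is treated as a
        # mononucleotide repeat and is not reported as a TNR.
        if len(set(unit)) == 1:
            continue
        n_units = len(match.group(0)) // 3
        tracts.append((match.start(), match.end(), unit, n_units))
    return tracts


if __name__ == "__main__":
    # Toy sequence: 7 CAG units (reported) and 5 GAA units (below threshold).
    example = "TT" + "CAG" * 7 + "AC" + "GAA" * 5 + "T"
    for tract in find_tnr_tracts(example):
        print(tract)
```

Because the scan runs left to right, the reported unit is whichever reading frame is met first (for example, GCA rather than CAG if the tract is preceded by a G). Grouping tracts into repeat classes such as CNG would need an extra normalization step that is not shown here.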
Similar resources
Comparative analysis of amino acid repeats in rodents and humans.
Amino acid tandem repeats, also called homopolymeric tracts, are extremely abundant in eukaryotic proteins. To gain insight into the genome-wide evolution of these regions in mammals, we analyzed the repeat content in a large data set of rat-mouse-human orthologs. Our results show that human proteins contain more amino acid repeats than rodent proteins and that trinucleotide repeats are also mo...
Approaches to genetic diagnosis in neuromuscular conditions in the era of next generation sequencing.
The diagnosis of neuromuscular disorders traditionally involved clinical and neurophysiological assessment, pathological evaluation of muscle and/or nerve biopsy and sequential testing of individual genes. Next generation sequencing (NGS) has revolutionised the diagnostic paradigm in genetic disorders, with the capability to capture and sequence genes, the entire exome (1% of the protein coding...
of expanded glutamine repeats in neurodegeneration - - current situation and new possibilities
Tandem repeats, that is simple sequence repeats, occur commonly in the human genome, and they have long been used as markers in linkage studies. In this decade, it has also been found that tandem repeats underlie an entirely new class of human mutations. The expansion of a group of trinucleotide repeats is now known to cause several inherited diseases, all of which are neurological disorders. T...
The genome of herpes simplex virus type 1 is prone to form short repeat sequences
Herein, we report a very high content of simple sequence repeats (SSRs) covering 66.12% of the herpes simplex virus type 1 (HSV-1) genome when a low threshold is adopted to define SSRs, indicating that repeat sequence is a very important character of the HSV-1 genome. The repeats with two iterations account for 68.33% of the total repeats. In reality, the genome of HSV-1 is prone to form shorte...
Asparagine Repeats in Plasmodium falciparum Proteins: Good for Nothing?
Malaria is a deadly parasitic human disease that poses a significant health risk for about 3.3 billion people in the tropical and subtropical regions of the world [1]. The past decade has seen significant progress in our understanding of the biology of the most deadly parasite species, Plasmodium falciparum. The groundwork for this progress was laid by genome sequencing efforts that revealed a ...